weighting approach
Effects of term weighting approach with and without stop words removing on Arabic text classification
Alhenawi, Esra'a, Khurma, Ruba Abu, Castillo, Pedro A., Arenas, Maribel G.
Text classification is a method for categorizing documents into predefined groups. Before classification, text documents must be prepared and represented in a form suitable for the data mining algorithms used. To this end, a number of term weighting strategies have been proposed in the literature to improve the performance of text classification algorithms. This study compares the effects of the Binary and Term Frequency (TF) feature weighting methods on text classification, both with and without stop word removal. To assess the effects of these feature weighting approaches on classification results in terms of accuracy, recall, precision, and F-measure, we used an Arabic data set made up of 322 documents divided into six main topics (agriculture, economy, health, politics, science, and sport), each of which contains 50 documents, with the exception of the health category, which contains 61 documents. The results show that, with stop word removal, the TF approach outperforms the binary approach on all metrics, whereas without stop word removal the binary approach outperforms the TF approach on accuracy, recall, and F-measure. For precision, however, the two approaches produce very similar results. The results also show that, for a given term weighting approach, stop word removal increases classification accuracy.
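The two weighting schemes compared in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, raw counts as the TF weight (the abstract does not say whether TF is normalized), and a small hypothetical English stop word list standing in for the Arabic list used in the study. The names `STOP_WORDS`, `tokenize`, `build_vocabulary`, and `weight_document` are illustrative and do not come from the paper.

```python
from collections import Counter

# Hypothetical stop word list for illustration only; the study's Arabic list is not given.
STOP_WORDS = {"the", "of", "and", "in"}


def tokenize(text, remove_stop_words):
    """Lowercase, split on whitespace, and optionally drop stop words."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def build_vocabulary(documents, remove_stop_words):
    """Return the sorted list of distinct terms across all documents."""
    vocabulary = set()
    for document in documents:
        vocabulary.update(tokenize(document, remove_stop_words))
    return sorted(vocabulary)


def weight_document(text, vocabulary, scheme, remove_stop_words):
    """Represent one document as a vector over the vocabulary.

    scheme="binary": 1 if the term occurs in the document, else 0.
    scheme="tf":     raw number of occurrences of the term (an assumption).
    """
    counts = Counter(tokenize(text, remove_stop_words))
    if scheme == "binary":
        return [1 if counts[term] > 0 else 0 for term in vocabulary]
    if scheme == "tf":
        return [counts[term] for term in vocabulary]
    raise ValueError(f"unknown weighting scheme: {scheme}")


if __name__ == "__main__":
    docs = ["the price of wheat and the price of corn", "the match in the stadium"]
    vocab = build_vocabulary(docs, remove_stop_words=True)
    print(vocab)
    print(weight_document(docs[0], vocab, "binary", remove_stop_words=True))
    print(weight_document(docs[0], vocab, "tf", remove_stop_words=True))
```

The four combinations studied (binary or TF, with or without stop word removal) correspond to the four settings of `scheme` and `remove_stop_words`; the resulting vectors would then be passed to whichever classifier is being evaluated.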
- Asia > Middle East > Jordan > Amman Governorate > Amman (0.05)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.72)
Unsupervised Personalization of an Emotion Recognition System: The Unique Properties of the Externalization of Valence in Speech
Abstract: The prediction of valence from speech is an important, but challenging problem. The externalization of valence in speech has speaker-dependent cues, which contribute to performances that are often significantly lower than the prediction of other emotional attributes such as arousal and dominance. A practical approach to improve valence prediction from speech is to adapt the models to the target speakers in the test set. Adapting a speech emotion recognition (SER) system to a particular speaker is a hard problem, especially with deep neural networks (DNNs), since it requires optimizing millions of parameters. This study proposes an unsupervised approach to address this problem by searching for speakers in the train set with similar acoustic patterns as the speaker in the test set. Speech samples from the selected speakers are used to create the adaptation set. This approach leverages transfer learning using pre-trained models, which are adapted with these speech samples. We propose three alternative adaptation strategies: unique speaker, oversampling and weighting approaches. These methods differ on the use of the adaptation set in the personalization of the valence models. The results demonstrate that a valence prediction model can be efficiently personalized with these unsupervised approaches, leading to relative improvements as high as 13.52%.
Index Terms: Speech emotion recognition, adaptation, transfer learning, emotional dimensions, valence.
[...] potential in fields such as human-computer interactions (HCIs), healthcare [1], [2] and behavioral studies [3], [4]. [...] is still a challenging task. The usual formulation to describe emotions is with categorical descriptors such as happiness, sadness, anger and neutral. In particular, the emotional attribute valence is key to understand many behavioral disorders [6], [7]. Although different approaches have been proposed to improve SER systems, the prediction of valence using acoustic [...]
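The speaker-selection step described in this abstract (finding train-set speakers whose acoustic patterns resemble the test speaker) might be sketched as follows. The abstract does not state how similarity is measured, so this sketch assumes each speaker is summarized by the mean of their acoustic feature vectors and compared by cosine similarity. The function names and the parameter `k` are hypothetical, and the three adaptation strategies themselves are not reproduced here.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length feature vectors into one summary vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar_speakers(train_features, test_frames, k):
    """Return the k train speakers most similar to the test speaker.

    train_features: dict mapping speaker id -> list of feature vectors.
    test_frames:    list of feature vectors from the (unlabeled) test speaker.
    No emotion labels of the test speaker are used, so the selection is unsupervised.
    """
    target = mean_vector(test_frames)
    scored = [
        (cosine_similarity(mean_vector(frames), target), speaker)
        for speaker, frames in train_features.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [speaker for _, speaker in scored[:k]]
```

Speech samples from the returned speakers would form the adaptation set on which a pre-trained model is then fine-tuned.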
- Europe > United Kingdom > England > East Sussex > Brighton (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
An Outcome Model Approach to Translating a Randomized Controlled Trial Results to a Target Population
Goldstein, Benjamin A., Phelan, Matthew, Pagidipati, Neha J., Holman, Rury R., Pencina, Michael J., Stuart, Elizabeth A.
Acknowledgments: We thank the NAVIGATOR steering committee and investigators for access to the NAVIGATOR data.
Affiliations: Department of Biostatistics & Bioinformatics, Duke University, Durham, NC (BAG, MJP); Center For Predictive Medicine, Duke Clinical Research Institute, Durham, NC (BAG, MP, NHJ); Department of Medicine, Duke University, Durham, NC (NHJ); Diabetes Trials Unit, Oxford Centre for Diabetes, Endocrinology, and Metabolism, University of Oxford, Oxford (RRH); Department of Biostatistics, Johns Hopkins University, Baltimore, MD (EAS)
Funding: This work was supported by National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) career development award K25 DK097279 (B.A.G.), US Department of Education Institute of Education Sciences Grant R305D150003 (EAS). The project described was supported by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), through Grant Award Number UL1TR001117 at Duke University. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. NAVIGATOR was funded by Novartis.
Abstract: Participants enrolled in randomized controlled trials (RCTs) often do not reflect real-world populations. Previous research on how best to translate RCT results to target populations has focused on weighting RCT data to look like the target data. Simulation work, however, has suggested that an outcome model approach may be preferable. Here we describe such an approach using source data from the 2x2 factorial NAVIGATOR trial, which evaluated the impact of valsartan and nateglinide on cardiovascular outcomes and new-onset diabetes in a "pre-diabetic" population. Our target data consisted of people with "pre-diabetes" serviced at our institution. We used Random Survival Forests to develop separate outcome models for each of the 4 treatments, estimating the 5-year risk difference for progression to diabetes. We then estimated the treatment effect in our local patient population, as well as in subpopulations, and compared the results to the traditional weighting approach.
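The outcome model idea in this abstract (fit one model per treatment arm on trial data, then predict for every person in the target population) can be sketched in a few lines. This is not the authors' implementation: in place of Random Survival Forests it uses a deliberately simple stand-in, the observed event proportion within each covariate stratum, and it ignores censoring and follow-up time. The function names, arm labels, and the single stratum covariate are hypothetical.

```python
from collections import defaultdict


def fit_arm_model(rows):
    """Fit a stand-in outcome model for one trial arm.

    rows: list of (stratum, event) pairs, where event is 1 or 0.
    Returns a dict mapping stratum -> observed event proportion.
    """
    totals = defaultdict(lambda: [0, 0])  # stratum -> [events, participants]
    for stratum, event in rows:
        totals[stratum][0] += event
        totals[stratum][1] += 1
    return {s: events / n for s, (events, n) in totals.items()}


def target_risk_difference(trial, target_strata, treated, control):
    """Average predicted risk difference (treated minus control) over the target.

    trial:         dict mapping arm name -> list of (stratum, event) pairs.
    target_strata: list with the stratum of each person in the target population.
    Every stratum in the target must also appear in both trial arms.
    """
    models = {arm: fit_arm_model(rows) for arm, rows in trial.items()}
    differences = [
        models[treated][s] - models[control][s] for s in target_strata
    ]
    return sum(differences) / len(differences)
```

Because predictions are made for each target individual, the same fitted models can be averaged over any subpopulation of the target without refitting, which is one practical difference from reweighting the trial sample.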
- North America > United States > North Carolina > Durham County > Durham (0.65)
- North America > United States > Maryland > Baltimore (0.24)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.24)
- North America > United States > New York (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)